Phonetic state tied-mixture tone modeling for large vocabulary continuous Mandarin speech recognition
Abstract
This paper presents a new approach to tone modeling for continuous Mandarin speech recognition. Mandarin tones provide rich information for speech recognition. In this paper, we treat the tone as an attribute of the final (vowel) part of a Mandarin syllable. Separate distributions are estimated for the cepstral coefficients and the pitch features, and the phonetic state tied-mixture technique is exploited to improve the modeling. Several tying structures are investigated, and the results are compared with those obtained without tonal parameters. After the tone models are integrated, decent improvements can be achieved in large vocabulary continuous Mandarin speech recognition. In addition, this approach can be easily incorporated into the one-pass Viterbi search framework for the practical implementation of a Mandarin dictation system.
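The abstract describes separate distributions for cepstral and pitch features combined with tied mixtures. A minimal sketch of how such a score could be computed is given here; it is an illustration under stated assumptions, not the paper's actual formulation. It assumes diagonal-covariance Gaussians, a codebook of Gaussians shared by several states with only the mixture weights being state-specific, and a hypothetical `pitch_weight` exponent on the pitch stream. All function names and numbers are made up for the example.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a non-empty list."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def tied_mixture_loglik(x, codebook, weights):
    """Log-likelihood under a tied mixture.

    codebook: list of (mean, var) pairs shared by several states (assumption).
    weights:  mixture weights specific to one state, summing to 1.
    """
    terms = [
        math.log(w) + log_gauss_diag(x, mean, var)
        for w, (mean, var) in zip(weights, codebook)
        if w > 0.0
    ]
    return log_sum_exp(terms)


def two_stream_loglik(cep, pitch, cep_model, pitch_model, pitch_weight=1.0):
    """Add the cepstral and pitch stream log-likelihoods for one frame.

    Each model is a (codebook, weights) pair. pitch_weight is a hypothetical
    stream exponent; 0.0 reduces the score to the cepstral stream alone.
    """
    cep_ll = tied_mixture_loglik(cep, *cep_model)
    pitch_ll = tied_mixture_loglik(pitch, *pitch_model)
    return cep_ll + pitch_weight * pitch_ll


# Toy data: two tonal variants of one final share both codebooks and the
# cepstral weights, and differ only in their pitch mixture weights.
cep_codebook = [([0.0, 0.0], [1.0, 1.0]), ([1.0, 1.0], [1.0, 1.0])]
pitch_codebook = [([0.0], [1.0]), ([2.0], [1.0])]
cep_model = (cep_codebook, [0.5, 0.5])
tone_a = (pitch_codebook, [0.9, 0.1])
tone_b = (pitch_codebook, [0.1, 0.9])

frame_cep, frame_pitch = [0.5, 0.5], [2.0]
score_a = two_stream_loglik(frame_cep, frame_pitch, cep_model, tone_a)
score_b = two_stream_loglik(frame_cep, frame_pitch, cep_model, tone_b)
```

In this toy setup the pitch value lies at the mean of the second pitch Gaussian, so the variant that puts most of its pitch weight there (`tone_b`) receives the higher score, while the cepstral stream contributes equally to both. Because the Gaussians are shared, each one needs to be evaluated only once per frame and can be reused by every state tied to the same codebook.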
Similar resources
Speech Recognition Using Monophone and Triphone Based Continuous Density Hidden Markov Models
Speech recognition is the process of transcribing speech to text. Phoneme-based modeling is used, in which each phoneme is represented by a Continuous Density Hidden Markov Model. Mel Frequency Cepstral Coefficients (MFCC) are extracted from the speech signal, and delta and double-delta features representing the temporal rate of change of the features are added, which considerably improves the recognition accura...
A bi-lingual Mandarin/taiwanese (min-nan), large vocabulary, continuous speech recognition system based on the tong-yong phonetic alphabet (TYPA)
In this paper, we describe the first Mandarin/Taiwanese (Min-nan) bi-lingual, continuous speech recognition system for large vocabulary or vocabulary-independent applications. A phonetic transcription system called Tong-yong Phonetic Alphabet (TYPA) is described and used to transcribe the bilingual Mandarin/Taiwanese lexicons. The Right-Context-Dependent (RCD) phonetic continuous-density Hidden ...
Two-stream modeling of Mandarin tones
Tone modeling is a critical component for Mandarin large-vocabulary continuous-speech recognition systems. In previous work on pitch-feature extraction, we reported character error rate reductions of over 30% over the non-tonal baseline [1]. In this paper, we investigate how best to integrate tone modeling with a Mandarin LVCSR system. The paper focusses on the two-stream method, which is based ...
Large vocabulary Mandarin speech recognition with different approaches in modeling tones
Large vocabulary continuous Mandarin speech recognition has been an important problem for speech recognition researchers for several reasons [1], [3]. First of all, Mandarin is a tonal language that requires special treatment for the modeling of tones. There are five tones in Mandarin, which are necessary to disambiguate between confusable words. Secondly, the difficulty of entering Chinese by keyboar...
Modeling Lexical Tones for Mandarin Large Vocabulary Continuous Speech Recognition